This initial version of provisioning will implement a more limited set of use cases than the original policy-driven effort. This document lays out the use cases expected in this first version and the workflows they imply.
The application architect or engineering services team will provide the definition and content of their application/project/business service to the RHQ provisioning system, allowing it to be deployed as part of the development lifecycle. This assumes a typical JEE application on JBoss EAP or EWS. Most likely at least one database is involved, and potentially other resources such as LDAP, mail servers or web services.
The application construction and configuration will be reviewed to decide on the deployment file layout and the files that must be configured for specific environments
The application recipe will be written to define this model and will be tested in development deployments to ensure that the application is fully deployed and working (including upgrade and clustering scenarios when appropriate)
Check that the service definition configuration will be valid for all environments (shared service installation and definition)
The environment will be defined and documented with the set of configurations needed for deployment to other environments (e.g. qa, stage, perf and prod)
Will we need to let the bundle provider document the meaning of a configuration property? (Now which database is this property supposed to point to?)
Deployments to QA using alternate infrastructure will provide proof that nothing is hard coded for dev deployment scenarios
Integrate bundle construction into build system. Track recipe in SCM. Build bundle package.
Update the bundle in RHQ with the up-to-date, automatically built bundle and release information.
The recipe should be a system for storing bundle data outside of the RHQ database. RHQ cannot be the system of record for the bundles themselves. This implies a certain amount of data that the bundle should include, and potentially argues for another abstraction layer between the recipe and the system, such as a bundle metadata file. The following data is needed for a bundle.
Name, description, (tags?)
Version
Compatibility, requirements, dependencies (e.g. supported operating systems, disk space requirements)
if bundles might be shared publicly
More info, web site links
license
Questions for property replacement / environment
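As a rough illustration of what a bundle metadata file might carry, the fields listed above could be modeled as below. This is only a sketch: the class name `BundleMetadata`, the field names and the `missing_fields` helper are hypothetical and are not part of RHQ.

```python
from dataclasses import dataclass, field


@dataclass
class BundleMetadata:
    """Hypothetical metadata record stored with the recipe in SCM."""
    name: str
    version: str
    description: str = ""
    tags: list = field(default_factory=list)
    # Assumed shape, e.g. {"os": ["linux"], "disk_mb": 500}
    requirements: dict = field(default_factory=dict)
    website: str = ""
    license: str = ""
    # Property name -> question shown to the deployer, so the bundle
    # provider can document what each property is supposed to mean.
    questions: dict = field(default_factory=dict)

    def missing_fields(self):
        """Return the names of mandatory fields that are blank."""
        return [n for n in ("name", "version") if not getattr(self, n).strip()]
```

Keeping the questions next to the version and requirements would let the bundle describe itself without RHQ being the system of record.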
Assume the machines to be deployed to are already in inventory
The deployer will review the deployment plan of machines to deploy against
Possibly need to create a group for those machines to allow group deployments
Review the state and utilization on those machines (multiplexed deployments are the common case)
Review other utilization like ports on those machines
Review dependent infrastructure and predeploy
Is the load balancer in place and ready for configuration?
Is the database up and available from the machines to be deployed to and running the proper schema?
Are LDAP, webservice dependencies and other services in place?
Define deployment configurations
Set the production database password
Set the production admin domain password
Review predeploy plan
Which machines is this going out to and what effect will the deployment have on them?
Is there enough memory available on the machines to be deployed to? How about disk space? Network utilization?
What will the configuration files look like once processed for deployment?
Verify service functionality
Check deployment validity, review start logs
Confirm function (on all cluster nodes) and access through the load balancer
Confirm dependent functionality (LDAP, web services, etc.)
Import and confirm inventory integration into RHQ for monitoring
Set up RHQ monitoring. Add alert conditions. Define cluster groupings.
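Some of the pre-deploy review questions above (port utilization, disk space) lend themselves to automated checks run on each target machine. The following is a minimal sketch under that assumption; the function names are hypothetical and nothing here reflects an existing RHQ agent API.

```python
import shutil
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def enough_disk(path, required_bytes):
    """Return True if the filesystem holding path has the required free space."""
    return shutil.disk_usage(path).free >= required_bytes
```

A deployment preview could run checks like these for every port and install directory named in the deployment configuration and show the results per machine.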
A new build is available from development and is provided by release engineering into the RHQ bundle system as an update
Confirm recipe validity in dev testing. (Recipe should be kept up to date as development alters configuration requirements)
Add any necessary deployment documentation to the recipe
Bundle version is marked as ready for further deployment
As a new version is tested through the lifecycle... how do we keep the prod deployer from accidentally pushing it to production?
How are notes passed on the deployments between developer, release engineering and deployment?
A deployer is updating a production deployment and has confirmation on the bundle version to be released to that environment
Confirm that any dependency changes that are different are reviewed (is that new web service dependency up?)
It's a load balanced application? Are we putting up the "down for upgrade" message?
This release also requires a schema upgrade for the application. Are we shutting down the old app and waiting for the upgrade to be run by DBAs?
What if the new release will be adding a few nodes to the cluster?
Jump to step 5 of the first time deployment
Deployment does not start properly
A port conflict
File permissions fail
db schema not upgraded successfully
Dependent service not upgraded in sync
gremlins
How do I get to both the deployment logs and the start logs for the services I deployed?
Let the recipe define "important paths" where detected file changes could be dangerous. (e.g. a new file in the deploy directory is bad... one in the work directory is ok)
Reverting of deployments is necessary (particularly in deployment failures)
This goes back to the state right before the deploy (not the previously installed version)
Either we save an entire backup, or more likely, we install the previous version and add in the backed up files
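The second revert option (reinstall the previous version, then add back the files that were backed up) might look roughly like this. It is a sketch only: the directory layout and the `revert` function are assumptions, and a real implementation would also need to stop and restart the service.

```python
import shutil
from pathlib import Path


def revert(install_dir, previous_version_dir, backup_dir):
    """Restore install_dir to its state right before the failed deploy.

    previous_version_dir holds the pristine previous bundle version;
    backup_dir holds the files that were backed up before the deploy.
    """
    install_dir = Path(install_dir)
    if install_dir.exists():
        shutil.rmtree(install_dir)  # drop the failed deployment
    shutil.copytree(previous_version_dir, install_dir)
    # Overlay the backed-up files on top of the clean previous version.
    shutil.copytree(backup_dir, install_dir, dirs_exist_ok=True)
```

The overlay relies on `dirs_exist_ok`, which requires Python 3.8 or later.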
For those who won't have a feed of their builds into the bundle system, a user will have to upload specific releases into the system. For bundles to work outside the RHQ deployment path, more information will also be needed. Local inventory implies that version info should be in the bundle so a local deployment can properly inventory itself. An answer file format should be supported for testing deployments and for export/import to other environments.
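One possible shape for the answer file is a simple properties-style text file. The sketch below assumes that format (`key=value` lines, `#` comments), which is not decided anywhere in this document; `load_answers` and `unanswered` are hypothetical helpers.

```python
def load_answers(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    answers = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"line {number}: expected key=value, got {raw!r}")
        answers[key.strip()] = value.strip()
    return answers


def unanswered(questions, answers):
    """Return the bundle's question names that the answer file does not cover."""
    return sorted(q for q in questions if q not in answers)
```

The same file could then be exported from one environment's deployment def and imported as the starting point for another.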
Uploading a bundle package should either create a new bundle and completely set it up or add a new version to an existing bundle. The wizard would allow the user to confirm the operation to be executed.
Deployment defs should be managed separately to allow sysadmins to define their environments and the question responses for a bundle. It would be really useful for these defs to support an environment tagging system so that different defs of the same environment (dev, test, stage, prod, dr) can be correlated.
A major open question is how to manage which environments a bundle version is ready for. A typical five-environment system will often have two to three active versions across the environments, and tracking where a version belongs at any given point is critical. Otherwise a deployer may accidentally push version 1.1, which is still in early testing, to prod instead of the 1.0.1 patch that was already confirmed for deployment. A separate workflow should be designed for managing those deployment assignments.
Recommend we build the triple tagging system discussed before to support folksonomy and assignment workflows. Add to bundle versions, resources and deployment defs. See Design - Tagging for details.
environment=[dev,qa,stage,perf,prod,dr]
qaStatus=[untested,tested,confirmedForProduction]
department=[consumerBanking, commercialBanking, investmentBanking, hr, it]
project=[SpecialExplorationProject, DataConsistencyProject, PublicWebsite, LdapServices]
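To illustrate how such tags could guard against pushing an unconfirmed version to production, here is a minimal sketch. The policy table `REQUIRED_QA_STATUS` is an assumption made up for the example (the document leaves the assignment workflow open), and tags are simplified to `name=value` strings.

```python
# Assumed policy: which qaStatus values a bundle version needs before it
# may be deployed to an environment. Environments not listed are unrestricted.
REQUIRED_QA_STATUS = {
    "stage": {"tested", "confirmedForProduction"},
    "perf": {"tested", "confirmedForProduction"},
    "prod": {"confirmedForProduction"},
    "dr": {"confirmedForProduction"},
}


def parse_tags(tags):
    """Turn ["qaStatus=tested", ...] into {"qaStatus": "tested", ...}."""
    return dict(tag.split("=", 1) for tag in tags)


def may_deploy(version_tags, environment):
    """Return True if a bundle version's tags allow deployment to environment."""
    allowed = REQUIRED_QA_STATUS.get(environment)
    if allowed is None:
        return True
    return parse_tags(version_tags).get("qaStatus") in allowed
```

With this check, a version 1.1 tagged `qaStatus=untested` would be rejected for prod, while a 1.0.1 patch tagged `qaStatus=confirmedForProduction` would pass.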
When installing, the user should be walked through the expected outcome. If the question response screen has already been filled out for an environment, reuse it. Allow the user to confirm the results of file property replacement: a screen listing all the files to be processed, with a way to preview them across the deployment (for each box). Detailed install steps should be exposed to the user, and the Ant workflows should include the results of each step. Should the steps be previewed before execution starts? The user should be able to watch the results as the system progresses. (Do we have a way to upgrade/install all servers at once or one at a time? Can we stop or continue on failure?)
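A preview of property replacement could be computed without touching the target files. The sketch below assumes a `@@name@@` token syntax purely for illustration; the actual token format and the `preview` function are not defined by this document.

```python
import re

# Assumed token syntax for the example: @@property.name@@
TOKEN = re.compile(r"@@([A-Za-z0-9_.]+)@@")


def preview(template, answers):
    """Return (rendered text, list of property names with no answer).

    Tokens without an answer are left in place so they stand out in the preview.
    """
    missing = []

    def substitute(match):
        key = match.group(1)
        if key in answers:
            return answers[key]
        missing.append(key)
        return match.group(0)

    return TOKEN.sub(substitute, template), missing
```

Running this per file and per machine would give the deployer both the processed file contents and a list of questions that still need a response.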
Error handling workflows will mostly involve collecting and displaying logs and step status. Which step failed? What was the outcome? What was the detailed logging (from Ant)? Need a system to quickly revert to a previous version on failure to deploy the next version.
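The per-step status described above, including the stop-or-continue question, can be sketched as follows. The `run_steps` function and its result tuples are hypothetical; in practice the steps would be Ant targets and the messages would come from Ant's logging.

```python
def run_steps(steps, continue_on_failure=False):
    """Run (name, callable) steps in order and record each outcome.

    Returns a list of (name, status, message) tuples where status is
    "ok", "failed" or "skipped".
    """
    results = []
    halted = False
    for name, action in steps:
        if halted:
            results.append((name, "skipped", ""))
            continue
        try:
            action()
            results.append((name, "ok", ""))
        except Exception as exc:  # report any failure as step status
            results.append((name, "failed", str(exc)))
            halted = not continue_on_failure
    return results
```

The resulting list answers "which step failed?" directly and shows which later steps never ran.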
Execution features
Install a bundle to a new directory
Check install status and report diffs to server
Pre-deploy check allowing you to see what the files will look like after property replacement
Install a bundle over the top of an existing install (upgrading and backing up as described above)
Install via Ant can be run offline and disconnected from the server
Install steps supported via Ant tasks
Shutdown
Check service status
Check deploy file statuses
Install or Upgrade install directory
Inventory of install
install / upgrade / remove OS service
Input properties (via file or interactive Q&A)
File property replacement
Startup
CLI
Upload bundle package with content and Ant recipe (new bundle or new version)
Deploy to resource or group
GUI
Bundle / Bundle version upload via bundle packages
Set up deployments with tagging
Resource / Resource Group tagging and tagging display in deployment screens
Choosing resource or group for new deployment
Choosing a new version / deployment def for a previously deployed resource / deployment combination
Deployment execution
Preview of changes and status
Preview of steps to install (targets & tasks)
Review of inventory file status report (e.g. file was changed since last deploy)
Execution status (across groups, all together or one at a time)
Detailed step review including error status and log messages
Resource / Deployment relationship
Bundle upload via URL (give the wizard a URL and have it fetch the bundle package)
Other
Resource to deployment linking through install inventory metadata and discovery utilities (custom resource relationships or fields?)
Resource tagging management
Content providers that can automatically pull release builds from external build management systems or SCMs